Combining task-dependent information with auditory attention cues for prominence detection in speech

Authors

  • Ozlem Kalinli
  • Shrikanth S. Narayanan
Abstract

Auditory attention is a highly complex mechanism that involves the processing of low-level acoustic features of sound together with higher-level cognitive rules. In this paper, a novel method is proposed that combines biologically inspired auditory attention cues with higher-level lexical and syntactic information to model task-dependent influences on a given task. Feature maps are extracted from sound at multiple scales by mimicking the processing stages in the human auditory system and are converted to low-level auditory gist features. The auditory attention model then biases the gist features based on the task to maximize target detection. The top-down task-dependent influence of lexical and syntactic information is incorporated into the model using a probabilistic approach. The combined model is tested on detecting prominent syllables in speech using the BU Radio News Corpus. The model achieves 88% prominence detection accuracy at the syllable level, which is comparable to reported human performance on this task.
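
The abstract does not spell out the probabilistic approach, so the following is only a minimal sketch of one common way to fuse a bottom-up acoustic score with a top-down lexical/syntactic score. It assumes the two evidence sources are conditionally independent given the class, which may not match the paper's actual model. The function name `combine_posteriors`, the example syllables, their probabilities, and the prior of 0.35 are all hypothetical values chosen for illustration.

```python
def combine_posteriors(p_acoustic, p_lexical, prior):
    """Fuse two posteriors for the 'prominent' class.

    Assumption (not taken from the paper): acoustic and lexical evidence
    are conditionally independent given the class, so
    P(c | a, l) is proportional to P(c | a) * P(c | l) / P(c).
    """
    # Unnormalized score for the 'prominent' class.
    pos = p_acoustic * p_lexical / prior
    # Unnormalized score for the 'non-prominent' class.
    neg = (1.0 - p_acoustic) * (1.0 - p_lexical) / (1.0 - prior)
    # Normalize so the two class scores sum to one.
    return pos / (pos + neg)


# Hypothetical per-syllable posteriors: (syllable, acoustic, lexical).
syllables = [
    ("news", 0.80, 0.70),
    ("the", 0.40, 0.10),
    ("cor", 0.55, 0.60),
]
PRIOR = 0.35  # hypothetical share of prominent syllables

for text, p_a, p_l in syllables:
    p = combine_posteriors(p_a, p_l, PRIOR)
    label = "prominent" if p >= 0.5 else "non-prominent"
    print(f"{text}: combined={p:.3f} -> {label}")
```

Under this assumption, a lexical score equal to the prior carries no information and leaves the acoustic posterior unchanged, while agreeing scores reinforce each other.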


Similar resources

Attentional Demand of Speech in Children and Adolescents with Developmental Stuttering

Background & Objective: Stuttering is a prevalent disorder in children and adolescents. Because attention is the only fuel resource for cognitive functions and language involves high cognitive functions, it is possible that speech difficulties are related to attention deficit. The purpose of this study was to investigate the attentional demand of speech in children and adolescents with dev...


Cognitive Processing of Audiovisual Cues to Prominence

This article addresses two related questions regarding the cognitive processing of audiovisual markers of prominence in spoken utterances: (1) how important are visual cues to prominence from the face with respect to verbal cues? and (2) are there differences between different facial areas in their cue value for prosodic prominence? The first perception experiment tackles the relation between a...


Eyebrow movement as a cue to prominence

Introduction: Speech communication is inherently multimodal in nature. While the auditory modality often provides the phonetic information necessary to convey a linguistic message, the visual modality can qualify the auditory information, providing segmental cues on place of articulation, prosodic information concerning prominence and phrasing, and extralinguistic information such as signals for t...


Facial expression and prosodic prominence: Effects of modality and facial area

This article addresses two related questions regarding the perception of facial markers of prominence in spoken utterances: (1) how important are visual cues to prominence from the face with respect to auditory cues? and (2) are there differences between different facial areas in their cue value for prosodic prominence? The first perception experiment tackles the relation between auditory and v...


Psychoacoustics and speech perception in individuals with auditory neuropathy and normal individuals

Background: The main result of hearing impairment is a reduction in speech perception. Patients with auditory neuropathy can hear but cannot understand. Their difficulties have been traced to timing-related deficits, revealing the importance of the neural encoding of timing cues for understanding speech. Objective: In the present study psychoacoustic perception (minimal noticeable differen...




Publication date: 2008